4 research outputs found

    Performance and Energy Optimization of the Iterative Solution of Sparse Linear Systems on Multicore Processors

    Get PDF
    En esta tesis doctoral se aborda la solución de sistemas dispersos de ecuaciones lineales utilizando métodos iterativos precondicionados basados en subespacios de Krylov. En concreto, se centra en ILUPACK, una biblioteca que implementa precondicionadores de tipo ILU multinivel para la solución eficiente de sistemas lineales dispersos. El incremento en el número de ecuaciones, y la aparición de nuevas arquitecturas, motiva el desarrollo de una versión paralela de ILUPACK que optimice tanto el tiempo de ejecución como el consumo energético en arquitecturas multinúcleo actuales y en clusters de nodos construidos con esta tecnología. El objetivo principal de la tesis es el diseño, implementación y valuación de resolutores paralelos energéticamente eficientes para sistemas lineales dispersos orientados a procesadores multinúcleo así como aceleradores hardware como el Intel Xeon Phi. Para lograr este objetivo, se aprovecha el paralelismo de tareas mediante OmpSs y MPI, y se desarrolla un entorno automático para detectar ineficiencias energéticas.In this dissertation we target the solution of large sparse systems of linear equations using preconditioned iterative methods based on Krylov subspaces. Specifically, we focus on ILUPACK, a library that offers multi-level ILU preconditioners for the effective solution of sparse linear systems. The increase of the number of equations and the introduction of new HPC architectures motivates us to develop a parallel version of ILUPACK which optimizes both execution time and energy consumption on current multicore architectures and clusters of nodes built from this type of technology. Thus, the main goal of this thesis is the design, implementation and evaluation of parallel and energy-efficient iterative sparse linear system solvers for multicore processors as well as recent manycore accelerators such as the Intel Xeon Phi. To fulfill the general objective, we optimize ILUPACK exploiting task parallelism via OmpSs and MPI, and also develope an automatic framework to detect energy inefficiencies

    Convolutional neural nets for estimating the run time and energy consumption of the sparse matrix-vector product

    Get PDF
    Modeling the performance and energy consumption of the sparse matrix-vector product (SpMV) is essential to perform off-line analysis and, for example, choose a target computer architecture that delivers the best performance-energy consumption ratio. However, this task is especially complex given the memory-bounded nature and irregular memory accesses of the SpMV, mainly dictated by the input sparse matrix. In this paper, we propose a Machine Learning (ML)-driven approach that leverages Convolutional Neural Networks (CNNs) to provide accurate estimations of the performance and energy consumption of the SpMV kernel. The proposed CNN-based models use a blockwise approach to make the CNN architecture independent of the matrix size. These models are trained to estimate execution time as well as total, package, and DRAM energy consumption at different processor frequencies. The experimental results reveal that the overall relative error ranges between 0.5% and 14%, while at matrix level is not superior to 10%. To demonstrate the applicability and accuracy of the SpMV CNN-based models, this study is complemented with an ad-hoc time-energy model for the PageRank algorithm, a popular algorithm for web information retrieval used by search engines, which internally realizes the SpMV kernel

    Harvesting Energy in ILUPACK via Slack Elimination

    Get PDF
    Ponència presentada al 2nd Workshop on Power-Aware Computing (PACO 2017) Ringberg Castle, Germany, July, 5-8 2017We develop a new energy-aware methodology to improve the energy consumption of a task-parallel preconditioned Conjugate Gradient iter- ative solver on a Haswell-EP Intel Xeon. This technique leverages the power-saving modes of the processor and the frequency range of the userspace Linux governor, modifying the CPU frequency for some oper- ations. We demonstrate that its application during the main operations of the PCG solver can reduce its energy consumption

    Adapting concurrency throttling and voltage–frequency scaling for dense eigensolvers

    Get PDF
    We analyze power dissipation and energy consumption during the execution of high-performance dense linear algebra kernels on multi-core processors. On top of this analysis, we propose and evaluate several strategies to adapt concurrency throttling and the voltage–frequency setting in order to obtain an energy-efficient execution of LAPACK’s routine dsytrd. Our strategies take into account the differences between the memory-bound and CPU-bound kernels that govern this routine, and whether problem data fits into the processor’s last level cache.This work was supported by the CICYT Project TIN2011-23283 of MINECO and FEDER, the EU Project FP7 318793 “EXA2GREEN”, and the FPU program of the Ministerio de Educación, Cultura y Deporte
    corecore